Abstract 

Background: Exposure measurement error is a concern in long-term PM2.5 health studies using ambient 
concentrations as exposures. We assessed error magnitude by estimating calibration coefficients as the association 
between personal PM2.5 exposures from validation studies and typically available surrogate exposures. 

Methods: Daily personal and ambient PM2.5, and when available sulfate, measurements were compiled from nine 
cities, over 2 to 12 days. True exposure was defined as personal exposure to PM2.5 of ambient origin. Since PM2.5 of 
ambient origin could only be determined for five cities, personal exposure to total PM2.5 was also considered. 
Surrogate exposures were estimated as ambient PM2.5 at the nearest monitor or predicted outside subjects' homes. 
We estimated calibration coefficients by regressing true on surrogate exposures in random effects models. 

Results: When monthly-averaged personal PM2.5 of ambient origin was used as the true exposure, calibration 
coefficients equaled 0.31 (95% CI:0.14, 0.47) for nearest monitor and 0.54 (95% CI:0.42, 0.65) for outdoor home 
predictions. Between-city heterogeneity was not found for outdoor home PM2.5 for either true exposure. 
Heterogeneity was significant for nearest monitor PM2.5, for both true exposures, but not after adjusting for 
city-average motor vehicle number for total personal PM2.5. 

Conclusions: Calibration coefficients were < 1, consistent with previously reported chronic health risks using nearest 
monitor exposures being under-estimated when ambient concentrations are the exposure of interest. Calibration 
coefficients were closer to 1 for outdoor home predictions, likely reflecting less spatial error. Further research is 
needed to determine how our findings can be incorporated in future health studies. 

Keywords: Exposure measurement error, Fine particles, Fine particles of ambient origin, Monitoring data, 
Spatio-temporal models 



Background 

Exposure measurement error is a limitation of epidemi- 
ologic studies of fine particles (PM2.5) [1-3], which gen- 
erally assess exposures using ambient concentrations 
measured at centrally located monitors. The impact of 
error on observed health risks can be substantial, poten- 
tially distorting associations and interactions between 
covariates and outcomes, reducing the power to detect 
effects, and leading to invalid inference [3-5]. 

In time series studies, use of measurements from ambi- 
ent monitors, even in absence of any instrumental error, 
has been shown to introduce both a Berkson error com- 
ponent, a result of using aggregated instead of individual 
exposure data, and a classical error component, a result of 
the difference between the aggregated exposure data and 
the true ambient PM2.5 concentrations [5]. Berkson error 
would not bias the health effect estimates, but would lead 
to an increased variance, while classical error, conversely, 
can lead to bias [5,6]. It has been shown that in presence of 
multiple monitoring sites in a city, using across-monitor 
averages, or population-weighted averages, would lead to 
less bias in time series studies [6,7]. Furthermore, less 
bias is expected when the pollutant of interest is spatially 
homogeneous, such as PM2.5 [6,8]. 
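The contrast between classical and Berkson error described above can be illustrated with a small simulation. This is a minimal sketch under assumed conditions, not an analysis from the study: the variances, sample size, true slope and the function names `ols_slope` and `simulate` are all hypothetical choices made for illustration.

```python
import random
import statistics


def ols_slope(x, y):
    """Ordinary least-squares slope of y regressed on x."""
    mean_x = statistics.fmean(x)
    mean_y = statistics.fmean(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


def simulate(n=20000, beta=1.0, seed=1):
    """Return (classical_slope, berkson_slope) for a true slope of beta.

    All distributions and variances are hypothetical illustration values.
    """
    rng = random.Random(seed)

    # Classical error: the surrogate scatters around the true exposure.
    truth = [rng.gauss(10.0, 3.0) for _ in range(n)]
    surrogate = [t + rng.gauss(0.0, 3.0) for t in truth]
    outcome = [beta * t + rng.gauss(0.0, 1.0) for t in truth]
    classical_slope = ols_slope(surrogate, outcome)

    # Berkson error: the true exposure scatters around the assigned value.
    assigned = [rng.gauss(10.0, 3.0) for _ in range(n)]
    truth_b = [a + rng.gauss(0.0, 3.0) for a in assigned]
    outcome_b = [beta * t + rng.gauss(0.0, 1.0) for t in truth_b]
    berkson_slope = ols_slope(assigned, outcome_b)

    return classical_slope, berkson_slope
```

With equal exposure and error variances, the classical-error slope should settle near half the true slope, while the Berkson-error slope should stay near the true slope, only with more scatter.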

To minimize error, PM2.5 exposures would ideally be 
measured using personal monitors, with analytical methods 
used to apportion these measurements into PM2.5 
constituents and sources. Such methods, however, are 
both expensive and intrusive and thus not feasible for 
studies conducted over long time periods with many 
subjects. Recently, for application to cohort studies, 
researchers have used statistical models to predict expo- 
sures outside participant residences [9-14], thus account- 
ing for spatial variation in ambient concentrations. While 
an improvement, such models still do not account for all 
sources of exposure variability, such as activity patterns, 
which can lead to biased results [15]. It can be argued, 
nevertheless, that such "biases" are the result of different 
target parameters of interest for the health effects of ambi- 
ent concentration vs. personal or ambient source exposure 
[16]. 

Further, spatial smoothing models can contribute a 
Berkson-like error component that results from smooth- 
ing the exposure surface and a classical-like component 
from variability in estimating model parameters [17-19]. 
The classical-like error can induce bias both towards 
and away from the null [18] with increased variability 
[18,20,21]. Even in the absence of other error sources, however, 
health effects estimated using outdoor concentrations 
will be attenuated proportionally to the PM2.5 
infiltration factor, the factor describing how much of the 
personal PM2.5 was generated outdoors, penetrated the 
building envelope and remained airborne [5]; if the expo- 
sure of interest is outdoor pollutant concentration rather 
than infiltrated personal exposure, however, it has been 
argued that this attenuation should not be regarded as a 
manifestation of measurement error [19]. 

Several previous acute effects studies have adjusted for 
exposure measurement error, showing that use of surro- 
gate exposures tends to bias the health effect estimates 
towards the null [22-24]. For long-term PM2.5 effects, 
a limitation in understanding the impact of measure- 
ment error on estimated health risks is the paucity of 
long-term personal exposure data [25-28]. We compiled 
exposure data from nine studies to estimate calibration 
coefficients for PM2.5 of ambient origin and total personal 
PM2.5 for cases when ambient concentrations or spatial 
models are used to assess exposures. In light of the com- 
plexity of measurement error in air pollution, the time 
scale of our validation data, and the uncertainty in our 
estimated calibration coefficients, our aim was to esti- 
mate and characterize calibration coefficients for PM2.5, 
but not to recommend their use to adjust health effect 
estimates in epidemiology studies directly. Our group is 
currently developing statistical methods to account for 
these limitations. 

Methods 

Personal exposure datasets 

We included data from studies of personal PM2.5 expo- 
sures based on the following criteria: i) the study had to be 
conducted in the United States, ii) during or after 1999, to 
ensure availability of PM2.5 concentration measurements 
at an EPA monitor located nearby, and iii) we had to be able 
to obtain the raw data, vs. the published summary statis- 
tics, from the investigators who originally conducted the 
study. 

Measurements of personal and ambient PM2.5 and, 
when available, sulfate (SO₄²⁻), were compiled from nine 
cities located throughout the United States (Table 1) 
[29-41]. A brief description of the validation studies is 
presented in Additional file 1. 

In each study, daily personal PM2.5 exposure data were 
collected following panel study sampling designs. The 
number of subjects per study ranged from 15 to 201, 
with sampling session durations ranging from 2 to 12 
days (median: 7 days). For each subject, we estimated 
monthly average personal exposures and used these in our 
analyses. 

All subjects were non-smokers and were monitored 
in multiple seasons. Study subjects included the elderly, 
patients with myocardial infarction, children, and adults. 
All subjects younger than 18 years were excluded from the 
analysis, since long-term air pollution health studies are 
often focused on adult mortality. 

The current analysis was approved by the Human Sub- 
jects Committee of the Harvard School of Public Health. 
All participants provided informed consent according to 
the protocols of the original studies. 

Surrogate exposures 

For each subject, we calculated two monthly PM2.5 sur- 
rogate exposures. First, we determined monthly ambient 
PM2.5 concentrations from the nearest US Environmental 
Protection Agency (EPA) Air Quality System monitor 
(nearest monitor), restricting the maximum allowed 
monitor-residence distance to 30 mi [42]. Monthly con- 
centrations were estimated using all available data within 
the month, i.e. not only the days used for the monthly 
averages of the personal exposures. 
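The nearest-monitor assignment described above can be sketched as follows. This is an illustrative sketch only: the haversine great-circle distance, the data layout (a dict of monitor coordinates) and the names `haversine_miles`, `nearest_monitor` and `monthly_mean` are assumptions, not the procedure documented by the authors. Only the 30 mi cutoff and the use of all available days in the month come from the text.

```python
import math
from statistics import fmean

EARTH_RADIUS_MI = 3958.8  # mean Earth radius in miles
MAX_DISTANCE_MI = 30.0    # maximum monitor-residence distance stated in the text


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_MI * math.asin(math.sqrt(a))


def nearest_monitor(home, monitors, max_miles=MAX_DISTANCE_MI):
    """Return (monitor_id, distance) for the closest monitor within
    max_miles of the home, or None when no monitor is close enough."""
    best = None
    for monitor_id, (lat, lon) in monitors.items():
        d = haversine_miles(home[0], home[1], lat, lon)
        if d <= max_miles and (best is None or d < best[1]):
            best = (monitor_id, d)
    return best


def monthly_mean(daily_values):
    """Average all available daily PM2.5 readings (ug/m^3) in one month."""
    return fmean(daily_values)
```

A subject with no monitor inside the cutoff gets `None`, so such subjects can be excluded explicitly rather than silently matched to a distant monitor.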

Second, we estimated monthly outdoor PM2.5 concentrations 
outside each subject's residence, at the latitude 
and longitude of the residence (at the zip code 
level for RIOPA subjects), using a nationwide expansion 
of a geographic information system (GIS)-based 
spatio-temporal model [14,43]. This model predicts monthly 
PM2.5 concentrations using a generalized additive model 
that fits monitoring data from governmental and research 
networks together with GIS-based covariates, including 
population density, distance to nearest roads, elevation, 
urban land use, PM2.5 point-source emissions and weather 
variables. 

Table 1 Validation studies used in our analyses 

Cities               Years       # Subjects   Sample session duration   Age, mean (S.D.)   References
Atlanta, GA          1999-2000   31           7 days                    65.0 (13.5)        [36]
Baltimore, MD        1998-1999   35           12 days                   70.8 (7.7)         [illegible]
Boston, MA           1999-2000   56           7 or 12 days              62.3 (14.1)
Los Angeles, CA      2000-2002   37           7 days                    56.3 (13.9)        [34,35]
RTP, NC              2000-2001   37           7 days                    64.5 (7.8)         [40,41]
RIOPA                1999-2001                48 hr                                        [32,39]
  Los Angeles, CA                73                                     46.1 (18.6)
  Elizabeth, NJ                  57                                     48.2 (17.8)
  Houston, TX                    62                                     48.5 (16.6)
Seattle, WA          2000-2001   89           10 days                   76.7 (6.5)         [31]
Steubenville, OH     2000        28           48 hr/wk for 12 wk        71.0 (10.0)        [29]

Estimation of personal exposures of ambient origin 

We assume the true exposure metric is personal expo- 
sures to PM2.5 of ambient origin, which reflect PM2.5 from 
sources relevant to epidemiological studies of ambient air 
pollution [26,44,45]. This quantity cannot be measured 
directly. 

To estimate personal PM2.5 of ambient origin, we used 
ambient SO₄²⁻ measurements, which were available for 
four cities (Atlanta, Baltimore, Boston and Steubenville). 
The majority of SO₄²⁻ is formed in the atmosphere 
through secondary reactions via either gas-phase or 
gas/particle phase oxidation [46] and is generally 
associated with coal combustion and coal-fired power plant 
emissions [47,48]. Because of negligible indoor sources 
and its similar spatial homogeneity as PM2.5, SO₄²⁻ can 
serve as a tracer for PM2.5 of ambient origin in locations 
where SO₄²⁻ comprises a large part of the PM2.5 mass 
[49,50], with the personal to ambient SO₄²⁻ ratio 
approximating the fraction of ambient PM2.5 that infiltrates 
indoors and remains airborne: 

$$\text{Personal PM}_{2.5}\text{ of ambient origin} = \frac{\text{SO}_{4,\,\text{personal}}^{2-}}{\text{SO}_{4,\,\text{ambient}}^{2-}} \times \text{PM}_{2.5,\,\text{ambient}}$$

In Seattle, for which personal SO₄²⁻ data were not available, 
personal PM2.5 of ambient origin was estimated as 
the weighted average of the indoor PM2.5 of ambient ori- 
gin (estimated using the corresponding calculated home 
infiltration efficiency) and ambient PM2.5, with the pro- 
portion of time each subject spent indoors and outdoors 
as weights [51]. 
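The two ways of estimating personal PM2.5 of ambient origin described above can be sketched as short functions. This is a hedged illustration: the function names are hypothetical, and the assumption that indoor PM2.5 of ambient origin equals the infiltration efficiency multiplied by ambient PM2.5 is a simplification introduced here, not a formula quoted from the source.

```python
def ambient_origin_from_sulfate(personal_so4, ambient_so4, ambient_pm25):
    """Sulfate-tracer estimate: scale ambient PM2.5 by the
    personal-to-ambient sulfate ratio (all inputs in ug/m^3)."""
    if ambient_so4 <= 0:
        raise ValueError("ambient sulfate must be positive")
    return (personal_so4 / ambient_so4) * ambient_pm25


def ambient_origin_time_weighted(ambient_pm25, infiltration_efficiency,
                                 fraction_indoors):
    """Time-weighted estimate for a city without personal sulfate data.

    Assumption (not from the source): indoor PM2.5 of ambient origin is
    infiltration_efficiency * ambient_pm25.
    """
    if not 0.0 <= fraction_indoors <= 1.0:
        raise ValueError("fraction_indoors must lie in [0, 1]")
    indoor_ambient_origin = infiltration_efficiency * ambient_pm25
    return (fraction_indoors * indoor_ambient_origin
            + (1.0 - fraction_indoors) * ambient_pm25)
```

For example, a personal-to-ambient sulfate ratio of 0.64 applied to an ambient concentration of 15 ug/m^3 gives 9.6 ug/m^3 of ambient origin under the tracer approach.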



Since personal exposures to PM2.5 of ambient origin 
could only be estimated in five cities, we also assessed 
error using total personal PM2.5 exposure. For this mea- 
sure, calibration coefficients will be less accurate, since 
total personal PM2.5 exposures also include indoor- and 
personally-generated PM2.5, which are independent from 
ambient PM2.5 [45]. 

Calibration coefficients 

The calibration coefficients were estimated as the fixed 
regression coefficients ($\gamma_1$) from linear mixed effects 
models of monthly averaged "true" on surrogate exposures, 
accounting for within-city correlated observations 
and repeated measures within subject: 

$$X_{ijk} = (\gamma_0 + g_{1i} + g_{2ij}) + (\gamma_1 + g_{3i}) Z_{ijk} + \gamma_2\,\text{Season}_{ijk} + \varepsilon_{ijk}, \tag{1}$$

where $X_{ijk}$ are the "true" exposures (either personal PM2.5 of 
ambient origin or total personal PM2.5) and $Z_{ijk}$ the 
surrogate exposures (either nearest ambient PM2.5 monitor 
or spatio-temporal model predictions) for $j = 1, \ldots, J_i$ 
subjects within city $i = 1, \ldots, I$, with $I = 5$ or $9$, and 
$k = 1, \ldots, K_{ij}$ repeated measures, 
$g_{1i} \sim \mathcal{N}(0, \sigma^2_{\text{city}})$, 
$g_{2ij} \sim \mathcal{N}(0, \sigma^2_{\text{subject}})$, 
$g_{3i} \sim \mathcal{N}(0, \sigma^2_{\text{CF-city}})$ and 
$\varepsilon_{ijk} \sim \mathcal{N}(0, \sigma^2)$. 

We explored the sensitivity of our results to assumptions 
about the covariance structure for repeated measures 
within subjects. Results are reported assuming compound 
symmetry covariance, with results similar for autoregres- 
sive covariance structure or when allowing heteroscedas- 
ticity. We also allowed for random seasonal effects by city, 
but our results were materially unchanged (results not 
shown). 

Calibration coefficients equal to 1 suggest no bias, while 
coefficients < 1 suggest an attenuated effect estimate. The 
p-values (denoted p-value_1) presented with the estimated 
calibration coefficients correspond to the hypothesis that 
$\gamma_1 = 1$ and were obtained using a $\chi^2_1$ reference 
distribution. 

Kioumourtzoglou et al. Environmental Health 2014, 13:2 
http://www.ehjournal.net/content/13/1/2 

Potential effect modification by season, with October- 
March as winter and April-September as summer, was 
assessed, as the association between personal expo- 
sures and ambient concentrations differs by season 
[29,33,37,38]. Stratified calibration coefficients are pre- 
sented when the estimated interaction term for season 
was significant. Statistical significance was assessed at the 
0.05 level. 
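The test of whether a calibration coefficient differs from 1 can be sketched as a Wald chi-square test with one degree of freedom. This is an assumption about the form of the test, since the source states only the hypothesis and the chi-square reference distribution. Back-calculating a standard error from a reported 95% confidence interval is likewise an approximation introduced here, and the names `se_from_ci` and `wald_p_value` are hypothetical.

```python
import math

Z_975 = 1.959964  # 97.5th percentile of the standard normal distribution


def se_from_ci(lower, upper):
    """Approximate standard error implied by a symmetric 95% interval."""
    return (upper - lower) / (2 * Z_975)


def wald_p_value(estimate, se, null_value=1.0):
    """p-value for H0: coefficient == null_value from a Wald statistic
    referred to a chi-square distribution with 1 degree of freedom."""
    w = ((estimate - null_value) / se) ** 2
    # For 1 degree of freedom, P(chi2 > w) = erfc(sqrt(w / 2)).
    return math.erfc(math.sqrt(w / 2))
```

Applied to the rounded values reported in this paper, the sketch gives a p-value far below 0.0001 for 0.31 (0.14, 0.47) and a p-value of roughly 0.24 for 0.81 (0.49, 1.12), in line with the reported significance levels; small differences are expected because the published intervals are rounded.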

Between-city heterogeneity 

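To assess potential between-city heterogeneity in the 
calibration coefficients, we tested the hypothesis 
$H_0\colon \sigma^2_{\text{CF-city}} = 0$, comparing Model 1 to Model 2, where 
Model 2 is the same as Model 1 without the random slope 
for cities ($g_{3i}$): 

$$X_{ijk} = (\gamma_0 + g_{1i} + g_{2ij}) + \gamma_1 Z_{ijk} + \gamma_2\,\text{Season}_{ijk} + \varepsilon_{ijk} \tag{2}$$

We used a likelihood ratio test (LRT) for this comparison, 
with the LRT statistic distributed as a 50:50 mixture of 
$\chi^2_0$ and $\chi^2_1$: p-value = 0.5 if $\hat\sigma^2_{\text{CF-city}} = 0$, and 
p-value = $0.5 \times (1 - \chi^2_1(\text{LRT}))$ otherwise, where 
$\chi^2_1(\cdot)$ denotes the cumulative distribution function [52]. 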
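The p-value rule for the variance-component test above can be written as a small function. This is a sketch that follows the rule as stated in the text; the function name `mixture_lrt_p_value` is hypothetical, and treating a non-positive statistic as the boundary case is an assumption made to absorb numerical noise.

```python
import math


def mixture_lrt_p_value(lrt):
    """p-value for a likelihood ratio statistic referred to a 50:50
    mixture of chi-square distributions with 0 and 1 degrees of freedom.

    Follows the rule in the text: 0.5 when the variance estimate sits on
    the boundary (statistic of zero), otherwise half the chi-square(1)
    upper-tail probability.
    """
    if lrt <= 0:
        # Boundary case; tiny negative values are treated as zero.
        return 0.5
    # Upper tail of chi-square(1): P(X > lrt) = erfc(sqrt(lrt / 2)).
    return 0.5 * math.erfc(math.sqrt(lrt / 2))
```

Halving the tail probability makes the test less conservative than a naive chi-square(1) comparison, which matters when the null value of the variance lies on the edge of the parameter space.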

We used step-wise selection to identify city-specific 
variables explaining any observed between-city hetero- 
geneity in the calibration coefficients. In presence of 
significant heterogeneity, we added to Model 1 candi- 
date city-specific variables together with interaction terms 
between the candidate variable and the surrogate expo- 
sure (Model 3). The candidate variables were kept in the 
model if the interaction term was significant. 

$$X_{ijk} = (\gamma_0 + g_{1i} + g_{2ij}) + (\gamma_1 + g_{3i}) Z_{ijk} + \gamma_2\,\text{Season}_{ijk} + \gamma_3\,\text{CityVariables}_i + \gamma_4 Z_{ijk}\,\text{CityVariables}_i + \varepsilon_{ijk} \tag{3}$$

Candidate city-specific variables were identified 
from previous studies showing their importance to the 
personal-ambient relationship, including air condition- 
ing use, unemployment, race, public transport [53-55] 
and traffic [54] (Additional file 1: Table S4). City-specific 
variables were obtained from the U.S. Census Bureau 
(Census 2000, www.census.gov), the American Housing 
Survey (www.census.gov/programs-surveys/ahs/), the 
National Climatic Data Center (www.ncdc.noaa.gov) and 
the Bureau of Labor Statistics (www.bls.gov). 

Leave-one-out cross-validation techniques were 
employed to validate the variable selection process 
[56, Chapter 7.10]. By omitting one city at a time ($-i$), we 
re-fit Model 3 using data from the remaining $I - 1$ cities, 
allowing for a different set of variables to be selected 
each time. We then predicted the city-specific calibration 
coefficient for the omitted city using the estimated model 
parameters together with the selected variable(s) of the 
omitted city, i.e. 
$\hat\gamma^{*}_{1i} = \hat\gamma_{1(-i)} + \hat\gamma_{4(-i)}\,\text{CityVariables}_i$. We 
also estimated city-specific calibration coefficients ($\gamma_{1i}$) 
employing city-specific mixed effects models (Model 4): 

$$X_{ijk} = (\gamma_{0i} + g_{2ij}) + \gamma_{1i} Z_{ijk} + \gamma_{2i}\,\text{Season}_{ijk} + \varepsilon_{ijk} \tag{4}$$

Finally, we compared the predicted to the observed 
city-specific calibration coefficients obtained from the 
city-specific models. 

We assessed the cross-validated results by the correlation 
between the predicted ($\hat\gamma^{*}_{1i}$) and observed ($\hat\gamma_{1i}$) 
calibration factors, the relative bias 
$\left(\frac{\hat\gamma^{*}_{1i} - \hat\gamma_{1i}}{\hat\gamma_{1i}}\right)$ and the absolute 
bias $\left|\frac{\hat\gamma^{*}_{1i} - \hat\gamma_{1i}}{\hat\gamma_{1i}}\right|$, both averaged over all cities. 
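The cross-validation summaries named above can be sketched as follows. The exact bias formulas are not legible in the source, so the definitions used here (difference between predicted and observed coefficient, divided by the observed coefficient) are assumptions, as are the names `pearson` and `cv_summary`.

```python
import math
from statistics import fmean


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = fmean(x), fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def cv_summary(predicted, observed):
    """Summarise leave-one-city-out predictions of calibration coefficients.

    Relative bias is (predicted - observed) / observed for each city, so
    it is undefined when an observed coefficient is exactly zero.
    """
    rel = [(p - o) / o for p, o in zip(predicted, observed)]
    return {
        "correlation": pearson(predicted, observed),
        "mean_relative_bias": fmean(rel),
        "mean_absolute_bias": fmean([abs(r) for r in rel]),
    }
```

Because each city's error is divided by its observed coefficient, cities with coefficients close to zero can dominate the averaged absolute bias, which is worth keeping in mind when reading a large mean absolute bias alongside a small mean relative bias.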

Sensitivity analyses 

To assess the robustness of our results, we assessed poten- 
tial effect modification by subpopulation: seniors (subjects 
older than 65 years) and subjects with chronic obstructive 
pulmonary disease (COPD), myocardial infarction (MI), and 
coronary heart disease (CHD). 

Sensitivity analyses were also performed to assess the 
effect of imperfectly matched monthly ambient and per- 
sonal exposures. We calculated calibration coefficients 
for monthly ambient levels estimated using only those 
days for which personal exposure measures were available. 
Since the EPA does not collect data daily at all locations, 
we allowed subjects to be matched to the nearest monitor 
with available data for that day. This sensitivity analysis 
could only be performed for the nearest ambient monitor 
concentrations, as the outdoor home model predictions 
were calculated at the monthly level only. 
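The two monthly averaging schemes compared in this sensitivity analysis can be sketched as follows. The data layout (a dict mapping day of month to ambient concentration, and a set of days with personal monitoring) and the function name `monthly_ambient_means` are hypothetical illustration choices.

```python
from statistics import fmean


def monthly_ambient_means(ambient_by_day, personal_days):
    """Return (all_days_mean, matched_days_mean) for one subject-month.

    ambient_by_day: dict of day -> ambient PM2.5 (ug/m^3) at the monitor.
    personal_days: set of days on which personal exposure was measured.
    matched_days_mean is None when no ambient reading falls on a
    personal-monitoring day.
    """
    all_days_mean = fmean(list(ambient_by_day.values()))
    matched = [v for day, v in ambient_by_day.items() if day in personal_days]
    matched_mean = fmean(matched) if matched else None
    return all_days_mean, matched_mean
```

Returning `None` for an empty match makes the gap visible, which corresponds to the situation in which a subject would need to be matched to a different monitor that has data for that day.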

In addition, we calculated calibration coefficients for 
total personal PM2.5 exposures using the same data 
as used to calculate calibration coefficients for personal 
PM2.5 of ambient origin. 

All statistical analyses were conducted using SAS soft- 
ware (Version 9.3, SAS Institute Inc, Cary, NC). 

Results 

Summary statistics and ambient-personal correlations are 
presented in Table 2 and Additional file 1: Table S2, 
respectively. By-city summary statistics are presented in 
Additional file 1: Table S1, and the relationship between 
exposure to PM2.5 of ambient origin and ambient PM2.5 
concentrations is presented in Additional file 1: Figure 
S1. On average, total personal PM2.5 was higher than 
both concentrations at the nearest ambient monitor and 
outdoor home predictions. Concentrations at the ambi- 
ent monitors were strongly correlated with outdoor home 
model predictions (Spearman rs = 0.86). PM2.5 of ambi- 
ent origin contributed 62%, on average, to the total per- 
sonal PM2.5. 
Table 2 Basic characteristics of exposure variables 

                                      # person-months (# subjects)   Mean ± SD (μg/m³)
All Year
  Total Personal PM2.5                919 (490)                      24.54 ± 18.97
  Personal PM2.5 of ambient origin    261 (141)                      9.71 ± 4.32
  Personal/Ambient SO₄²⁻ ratio^a      241 (131)                      0.64 ± 0.25
  Model Predicted PM2.5               1029 (502)                     15.47 ± 4.77
  Monitor PM2.5                       1029 (502)                     15.86 ± 5.58
Summer
  Total Personal PM2.5                429 (312)                      23.92 ± 16.92
  Personal PM2.5 of ambient origin    130 (90)                       10.95 ± 4.12
  Personal/Ambient SO₄²⁻ ratio^a      125 (86)                       0.70 ± 0.23
  Model Predicted PM2.5               493 (327)                      15.67 ± 4.84
  Monitor PM2.5                       493 (327)                      15.69 ± 4.79
Winter
  Total Personal PM2.5                490 (353)                      23.94 ± 19.67
  Personal PM2.5 of ambient origin    131 (97)                       8.73 ± 4.27
  Personal/Ambient SO₄²⁻ ratio^a      116 (87)                       0.59 ± 0.24
  Model Predicted PM2.5               536 (367)                      15.36 ± 4.71
  Monitor PM2.5                       536 (367)                      16.10 ± 6.21

^a Not estimated for Seattle, WA. 

Calibration coefficients 

The results from the linear mixed effects model (Model 1) 
for both personal PM2.5 of ambient origin and total 
personal PM2.5 are presented in Table 3. 

When the nearest ambient monitor was used as the 
surrogate exposure, the calibration coefficient for 
personal PM2.5 of ambient origin was estimated as 0.31 
(95% CI:0.14, 0.47; p-value_1 < 0.0001), when adjusted for 
seasonal effects. We found no significant seasonal effect 
modification (p-value = 0.71). The season-adjusted 
calibration coefficient was higher for outdoor home model 
predictions than for nearest monitor PM2.5, equaling 
0.54 (95% CI:0.42, 0.65; p-value_1 < 0.0001). We found 
significant effect modification by season for outdoor 
home model predictions (p-value = 0.006), with 
season-stratified calibration coefficients higher during winter 
(0.60 (95% CI:0.42, 0.78)) than summer (0.50 (95% CI:0.36, 
0.64)). 

Total personal PM2.5 exposure calibration coefficients 
were higher than those for personal PM2.5 of ambient 
origin (Table 3). For total personal PM2.5 exposures, 
the season-adjusted calibration coefficient for the 
nearest ambient monitor was 0.56 (95% CI:0.24, 0.88; 
p-value_1 = 0.007). Effect modification by season was 
significant (p-value = 0.041), with higher season-stratified 
calibration coefficients during summer (0.78 (95% CI:0.36, 
1.19)) than winter (0.48 (95% CI:0.12, 0.83)). The 
corresponding calibration coefficient, using outdoor home 
model predicted PM2.5 as the surrogate exposure, was 
higher, 0.81 (95% CI:0.49, 1.12; p-value_1 = 0.234). There was no 
significant seasonal effect modification. 

Table 3 Season-adjusted calibration factors for personal 
PM2.5 of ambient origin and total personal PM2.5 

                                            Monitor PM2.5          Model predicted PM2.5
Personal PM2.5 of ambient origin: 5 cities (141 subjects)
  Estimate (95% CI)^a                       0.31 (0.14, 0.47)**    0.54 (0.42, 0.65)**
  p-value for between-city heterogeneity    0.0034                 0.1114
  Variance explained^b                      10.75%                 0.96%
  Estimate (95% CI)^c                       N/A                    0.56 (0.44, 0.68)**
Total Personal PM2.5: 9 cities (490 subjects)
  Estimate (95% CI)^a                       0.56 (0.24, 0.88)**    0.81 (0.49, 1.12)
  p-value for between-city heterogeneity    0.0084                 0.1712
  Variance explained^b                      2.54%                  [illegible]
  Estimate (95% CI)^c                       N/A                    0.79 (0.54, 1.04)

*p-value_1 < 0.05, **p-value_1 < 0.01 for significant difference from 1. 
^a Results from Model 1 (including random slopes for cities, $g_{3i}$). 
^b The proportion of variance explained by the between-cities heterogeneity. 
^c Results from Model 2, when no significant between-city heterogeneity was 
detected (without the random slopes for cities, $g_{3i}$). 

Between-city heterogeneity 

For both personal PM2.5 of ambient origin and total 
personal PM2.5 calibration coefficients, we found no 
statistically significant evidence of heterogeneity across cities 
for outdoor home model predictions (p-values = 0.11 and 
0.17, respectively) and therefore results from Model 2, 
instead of Model 1, can be used. For personal PM2.5 of 
ambient origin and total personal PM2.5, calibration 
coefficients equaled 0.56 (0.44, 0.68) and 0.79 (0.54, 1.04), 
respectively. Since no between-city heterogeneity was 
detected, no further adjustment to these calibration coef- 
ficients was done. 

Significant between-city heterogeneity (p-value = 0.003) 
was detected in the calibration coefficients for personal 
PM2.5 of ambient origin when the nearest monitor was 
used as the surrogate exposure, with estimated 
city-specific calibration coefficients ranging from 0.0 to 0.71 
(Figure 1(a)). The observed between-city heterogeneity 
was explained by two variables: the city's average number 
of residents in a housing unit and the city's 30-year average 
of annual heating degree days, an indicator of the typical 
number of heating days in a year (p-value = 0.50 for the 
test for residual heterogeneity). Cross-validation showed, 
however, that these variables were not robust predictors 
of the between-city variation in the calibration coefficient 
(Additional file 1: Figure S2). 

Figure 1 Forest plots of the by-city calibration coefficients for (a) 
personal PM2.5 of ambient origin and (b) total personal PM2.5 
and nearest monitor concentrations. The size of the point used for 
the effect estimate is proportional to the precision of that calibration 
coefficient. [Figure not reproduced. Panel (a) rows: Baltimore, Boston, 
Steubenville, Atlanta, Seattle, Summary. Panel (b) rows: Baltimore, 
Boston, Steubenville, Atlanta, Los Angeles, Seattle, RTP, L.A. (R), 
Elizabeth (R), Houston (R), Summary. Horizontal axis in both panels: 
Calibration Coefficients.] 



Significant between-city heterogeneity in the calibration 
coefficient was also detected for total personal 
PM2.5 when the nearest monitor was used as the 
surrogate measure (p-value = 0.008). Using Model 4, 
estimated city-specific coefficients ranged from 0.0 to 1.78 
(Figure 1(b)). Step-wise selection found that some of the 
observed between-city heterogeneity was explained by 
the average number of vehicles per housing unit in each 
city (p-value = 0.221 for the test for residual 
heterogeneity). The effect of the city average vehicles per housing 
unit on the relationship between total personal PM2.5 
and nearest ambient monitor PM2.5 concentrations was 
-2.53 (SE: 0.82), implying that the calibration coefficient 
decreases as the average number of vehicles per housing 
unit increases. For instance, if the average number of 
vehicles per housing unit in a city increased by 0.1, then 
the calibration coefficient for that city would decrease by 
0.25. The selection of this variable was confirmed in the 
cross-validation, as it was consistently selected when cities 
were omitted one by one (Additional file 1: Figure S2). 
The correlation between the predicted calibration 
coefficients from each city and the observed by-city coefficients 
was 0.62 (p-value = 0.05), the mean percent relative bias 
was estimated at -0.76% and the mean percent absolute bias 
at 149%. 
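The arithmetic behind the vehicles-per-housing-unit example can be written out explicitly. The block below is a worked illustration using the interaction coefficient reported in this paragraph; the symbols follow Model 3, and treating the interaction term as the only source of change in the city-specific coefficient is an assumption of the illustration.

```latex
% Change in the city-specific calibration coefficient for a change
% in the city-level variable, holding everything else fixed.
\begin{align*}
\Delta \gamma_{1i}
  &= \hat\gamma_4 \times \Delta \text{CityVariables}_i \\
  &= -2.53 \times 0.1 \\
  &= -0.253 \approx -0.25
\end{align*}
```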

Sensitivity analyses 

Results from our sensitivity analyses are presented in 
Additional file 1. Briefly, we observed no significant 
effect modification by disease subpopulation. We did, 
however, find significant effect modification by age, with 
subjects younger than 65 years of age having lower 
calibration coefficients than their older counterparts 
(Additional file 1: Table S3). 

Further, we found that estimated calibration coefficients 
were similar irrespective of the method used to calculate 
monthly ambient concentrations at the nearest monitor. 
When all days within the month were used in the cal- 
culation, the calibration coefficient for personal PM2.5 
of ambient origin was 0.31 (95% CI:0.14, 0.47), vs. 0.35 
(95% CI:0.26, 0.43) when monthly ambient concentra- 
tions were calculated using only those days with personal 
monitoring. 

Discussion 

We estimated calibration coefficients for studies of the 
association of long-term PM2.5 health effects with ambi- 
ent air pollution exposures, considering both estimated 
personal exposures to PM2.5 of ambient origin as the 
exposure metric and personal exposures to total PM2.5 
as a second, albeit imperfect, exposure metric. Our goal 
was to assess and quantify error resulting from use 
of surrogate exposures and characterize the impact of 
different surrogate exposures on error. As discussed in the 
Background, however, the estimated error could arise 
from a variety of sources, and it has been argued that not 
all of these are properly characterized as measurement 
error [19]. 

Using estimated monthly personal PM2.5 of ambient 
origin from five cities as the true exposure measure, we 
estimated a calibration coefficient of 0.54 (95% CI:0.42, 
0.65) when outdoor home model predictions were used 
as the surrogate exposure, with no city-specific hetero- 
geneity. This calibration coefficient suggests that when 
the parameter of interest is the health effect of ambient 
source pollution, the observed effect could be about half 
the true effect when outdoor home model predictions 
are used as the exposure metric in a linear health model, 
in absence of other potential bias sources. The lack of 
observed between-city variability likely reflects the use of 
the spatio-temporal model, which incorporates variables 
that may explain much of the between-city variability, 
such as population density, urban land use and distance to 
nearest road. 
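The link between a calibration coefficient of 0.54 and an observed effect of roughly half the true effect can be made explicit with a short derivation. This is a sketch under simplifying assumptions that go beyond the text: a linear health model, and health-model errors unrelated to the surrogate exposure. The symbols $Y$, $\beta_0$, $\beta_X$ and $\epsilon$ are introduced here for illustration; $X$, $Z$, $\gamma_0$ and $\gamma_1$ follow Model 1.

```latex
% Assumed setting: linear health model in the true exposure X,
% with E[epsilon | Z] = 0 and a linear calibration model for X given Z.
\begin{align*}
Y &= \beta_0 + \beta_X X + \epsilon, \\
\mathrm{E}[X \mid Z] &= \gamma_0 + \gamma_1 Z, \\
\mathrm{E}[Y \mid Z] &= (\beta_0 + \beta_X \gamma_0) + \beta_X \gamma_1 Z .
\end{align*}
% Regressing Y on the surrogate Z therefore recovers the slope
% beta_X * gamma_1. With gamma_1 = 0.54 the observed slope is
% 0.54 * beta_X, i.e. about half the true effect.
```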

The estimated calibration coefficient for nearest ambi- 
ent monitor concentrations as the exposure metric 
was lower (0.31 (95% CI:0.14, 0.47) compared to 0.54 
(95% CI:0.42, 0.65) for outdoor model concentrations), 
reflecting the fact that nearest monitor concentrations 
do not account for as much spatial variability in ambient 
concentrations as the outdoor home model predictions. 
We also detected statistically significant between-city het- 
erogeneity. Factors explaining between-city variability in 
the calibration coefficient, nevertheless, could not be reli- 
ably identified. This inability to explain the city-specific 
heterogeneity likely reflects the small number of cities 
included in our analysis. 

When total PM2.5 was used as the true exposure mea- 
sure, calibration coefficients of 0.56 (95% CI:0.24, 0.88) 
and 0.81 (95% CI:0.49, 1.12) were found for nearest ambi- 
ent monitor PM2.5 and outdoor home model predic- 
tions, respectively. These results are consistent with those 
reported in Setton et al. (2011) [15], who reported an 
attenuation ranging from 0.70 to 0.84 for scenarios 
when mobility was not considered and only PM2.5 pre- 
dictions at the subjects' residences were included in the 
health model. As noted above, however, these calibration 
coefficients were calculated using total personal PM2.5, an 
imperfect measure of true exposure to ambient-generated 
pollutants. 

As was the case with personal PM2.5 of ambient ori- 
gin, we detected significant between-city heterogeneity 
in total PM2.5 calibration coefficients only when near- 
est monitor concentrations were used as the surro- 
gate exposure. For nearest monitor PM2.5, between-city 
heterogeneity was explained with the city average num- 
ber of vehicles per housing unit. Results showed that 



error increases with vehicles per housing unit. A possible 
explanation for this association is provided by the strong 
negative correlation between the average number of vehicles 
per housing unit and population density (r = -0.86) 
and the strong positive correlation with the percentage of 
detached homes in a study area (r = 0.88) as shown 
in Additional file 1: Figure S3. These correlations suggest 
that in less dense cities, residents need to travel longer dis- 
tances, possibly increasing the impact of pollutant spatial 
variability. These results are also in agreement with Setton 
et al. (2011) [15], who found increasing bias with increas- 
ing distance spent away from home. Selection of number 
of vehicles per housing unit to explain between-city het- 
erogeneity could also reflect varying PM2.5 composition, 
with local sources, such as traffic, likely comprising a 
larger portion of PM2.5 mass than regional sources in cities 
with more vehicles per housing unit. PM2.5 of 
local sources is more spatially heterogeneous and more 
error is, therefore, expected when it comprises a large 
fraction of the total ambient PM2.5. The fact, however, 
that our estimated city-specific calibration coefficients 
ranged from 0.0 to 1.9 complicates our interpretation of 
the overall estimate of 0.56 (95% CI:0.24, 0.88) and of 
the observed association with housing and transporta- 
tion characteristics, suggesting that one average calibra- 
tion coefficient may not adequately describe error from 
use of ambient monitor measurements across the United 
States. 

Environmental tobacco smoke (ETS) may also con- 
tribute, at least partially, to the observed between-city 
heterogeneity. In all studies in our analyses, subjects were 
selected as non-smokers, living in non-smoking homes. 
Although this inclusion criterion would minimize poten- 
tial exposure to ETS, it is possible that participants living 
in cities with more ETS would also have higher per- 
sonal PM2.5 exposures, thereby potentially contributing to 
between-city heterogeneity in the calibration coefficients. 
We, however, were not able to incorporate ETS exposures in 
our analysis, as some studies did not report ETS exposure 
information. 

Our findings are consistent with two studies by Avery 
et al. (2010) [57], who found a median correlation coeffi- 
cient of 0.54 between total personal PM2.5 exposures and 
concentrations at a centrally located monitor, and strong 
between-city heterogeneity (p-value<0.0001). Although 
their reported median correlation coefficient between 
total personal PM2.5 exposures and outdoor home con- 
centrations was similar, between-city heterogeneity in 
this association was lower (p-value = 0.05). The weaker 
evidence of heterogeneity for outdoor home PM2.5 
concentrations is consistent with our suggestion that 
between-city heterogeneity in calibration coefficients is 
explained by variables included in outdoor home model 
predictions; this is one explanation for why we found 
heterogeneity only for nearest monitor but not outdoor
home exposures. 

Our study is limited by several factors. First, the data 
available to validate the exposure metrics of interest were 
limited to a small number of cities and participants, espe- 
cially for personal PM2.5 of ambient origin. Also, only a 
small number of days in a month were available in some 
cities to estimate monthly averages. These small numbers
contributed to uncertainty in our data and estimates, and
may have prevented both the detection of between-city
heterogeneity for outdoor home predictions and the
identification of factors explaining the observed between-city
heterogeneity in calibration coefficients when nearest
monitor PM2.5 concentrations were the exposure surrogate.
Further, the cities included in our analyses may
not be representative of all US cities, and thus our esti- 
mated calibration coefficients might not be generalizable 
to other cities. Moreover, the association between per- 
sonal exposures and ambient concentrations might vary 
over years. Since our studies were conducted over a
one- to two-year time span (Table 1), we were not able to assess
the contribution of longer term personal-ambient trends 
to total error. 

In addition, personal PM2.5 of ambient origin was estimated
rather than measured, and our calibration coefficient
estimates do not take into account the uncertainty in these
predicted exposures. Moreover, given data availability, we were
not able to estimate the contribution of instrumental error
to total error.
Both personal and ambient measurements are prone to 
instrumental error, presence of which is likely to intro- 
duce classical error [5]. In our setting, however, personal 
exposures are the outcome variable in the regression and 
therefore random error in these exposures is not expected 
to introduce error in the estimated calibration coeffi- 
cients. Furthermore, personal exposures are on average 
measured with high precision and accuracy [29,30]. 
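
The argument above, that random error in the outcome variable does not bias a regression slope whereas classical error in the predictor attenuates it, can be illustrated with a small simulation. This is only a sketch under stated assumptions: it uses ordinary least squares rather than the random effects models used in the study, and every number in it (the slope of 0.5, the means and the error standard deviations) is hypothetical and not taken from the study data.

```python
import numpy as np


def ols_slope(x, y):
    """Ordinary least squares slope from regressing y on x."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1)


rng = np.random.default_rng(0)
n = 200_000
true_slope = 0.5  # hypothetical calibration coefficient

# Simulated surrogate (ambient) and personal exposures, in ug/m3.
surrogate = rng.normal(15.0, 4.0, n)
personal = 2.0 + true_slope * surrogate + rng.normal(0.0, 2.0, n)

# Random instrumental error added to the outcome (personal exposure).
personal_noisy = personal + rng.normal(0.0, 3.0, n)
# Classical error added to the predictor (surrogate exposure).
surrogate_noisy = surrogate + rng.normal(0.0, 4.0, n)

slope_clean = ols_slope(surrogate, personal)
slope_noisy_outcome = ols_slope(surrogate, personal_noisy)
slope_noisy_predictor = ols_slope(surrogate_noisy, personal)

print(f"no added error:        {slope_clean:.3f}")
print(f"error in the outcome:  {slope_noisy_outcome:.3f}")
print(f"error in the predictor: {slope_noisy_predictor:.3f}")
```

In this setup the error variance added to the predictor equals the predictor variance, so the expected attenuation factor is 16 / (16 + 16) = 0.5 and the third slope should fall near 0.25, while the first two should both stay near 0.5.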

To estimate personal PM2.5 of ambient origin we used
the SO₄²⁻ tracer method. In cities where SO₄²⁻ comprises
a large fraction of the total ambient PM2.5 mass,
as in the northeastern US [58], the SO₄²⁻ tracer method
has been shown to perform well [49]. In places, however,
where ambient PM2.5 mass is strongly influenced
by local sources, such as traffic, ambient SO₄²⁻ would
not act as a good tracer, given that the spatial and size
distributions of SO₄²⁻ may differ from those of PM2.5.
Since PM2.5 from local sources is more spatially heterogeneous,
larger spatial misalignment would be expected
in these cities and, hence, more measurement error. For
these cities, we would expect the calibration coefficients
for personal PM2.5 of ambient origin, which was estimated
using the SO₄²⁻ ratio, to be overestimated and the
error to be underestimated, a factor likely contributing to
the observed between-city heterogeneity. In our study, we
only had SO₄²⁻ data in four cities, three of which are in
the northeastern US (Baltimore, Boston and Steubenville).
The fourth city was Atlanta, which has been shown, on
average, to have lower SO₄²⁻ concentrations [58]. Even
there, however, secondary sulfate was found to comprise
38% of the total PM2.5 mass [59] and in our data, the
ratio of ambient SO₄²⁻ to PM2.5 in Atlanta was, on average,
similar to the ratios in the three northeastern cities
(Additional file 1: Table S1).
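
For readers unfamiliar with the sulfate tracer idea, the sketch below shows one common form of the calculation: the personal-to-ambient sulfate ratio is treated as the fraction of ambient PM2.5 that reaches the person and is multiplied by the ambient PM2.5 concentration. The exact formulation used in this study is given in its Methods and is not reproduced in this section, so the function name, its arguments and the example values here are all hypothetical.

```python
def ambient_origin_pm25(personal_so4, ambient_so4, ambient_pm25):
    """Estimate personal PM2.5 of ambient origin from a sulfate ratio.

    Assumption (illustrative only): sulfate has no relevant non-ambient
    sources, so personal/ambient sulfate approximates the fraction of
    ambient PM2.5 that the person is actually exposed to.
    All inputs are concentrations in ug/m3.
    """
    if ambient_so4 <= 0:
        raise ValueError("ambient sulfate concentration must be positive")
    sulfate_ratio = personal_so4 / ambient_so4
    return sulfate_ratio * ambient_pm25


# Hypothetical values: personal sulfate 3.0, ambient sulfate 5.0,
# ambient PM2.5 15.0 (all in ug/m3).
example = ambient_origin_pm25(personal_so4=3.0, ambient_so4=5.0, ambient_pm25=15.0)
print(f"estimated personal PM2.5 of ambient origin: {example:.1f} ug/m3")
```

The sketch also makes the limitation discussed above visible: if sulfate and locally generated PM2.5 differ in their spatial or size distributions, a single sulfate ratio will not describe how the local fraction reaches the person.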

In addition, we estimated the outdoor home predictions 
using a specific spatio-temporal model. This model has 
been validated and shown to perform very well [14,43]. 
We would therefore expect that our findings for outdoor 
home predictions could be extended to similarly perform- 
ing spatio-temporal models and could be qualitatively 
used for predicted concentrations obtained from other 
spatio-temporal models. 

Moreover, we were not able to disentangle how spe- 
cific error types would impact the health effect estimates 
obtained using either of the surrogate exposures. We did 
not assume models addressing specific error structures 
and our approach assesses overall error from use of sur- 
rogate exposures, combining the multiple error types that 
are likely present [5,18]. 

Furthermore, our study is not able to determine how 
much of the estimated calibration coefficient reflects infil- 
tration of particles from outdoor to indoor environments, 
as compared to other sources of the difference between 
personal exposure and outdoor concentration metrics 
[5,19]. Infiltration, however, does not appear to explain all 
of the observed error found in our analysis, since the average
estimated calibration coefficients for personal PM2.5
of ambient origin were <0.64 (the approximated penetration
efficiency using the SO₄²⁻ ratio), consistent with
additional contributing error sources. 

Additionally, personal exposures were measured for 
each participant for periods less than one month. We
would expect this temporal mismatch to introduce both
a Berkson error component, through the errors in the true
exposures that were randomly selected within a month, and
a classical error component, through the errors in the
temporal misalignment of the surrogate exposures. Through
sensitivity analyses, comparing PM2.5 concentrations measured
at the nearest monitor using all data within a month
with that measured on days when personal data were 
also available, we showed the point estimates to be very 
similar, but the confidence intervals for the calibration 
coefficients estimated using the temporally mismatched 
data were wider. Since outdoor home model predictions 
were only available at the monthly level, we were unable to 
quantitatively assess the effect of this temporal mismatch 
on the estimation of the calibration coefficients. Monthly 
concentrations at the nearest ambient monitor, however, 
were very strongly correlated with outdoor home model 
predictions (r = 0.86). In any event, randomly temporally
mismatched data relating personal exposures to outdoor 
home predictions may also lead to increased uncertainty, 
but likely no bias, in the calibration coefficients. 
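
The expectation of increased uncertainty without bias can be checked in a simplified simulation. The sketch below is an illustration under assumptions, not a reanalysis: it uses ordinary least squares instead of random effects models, and the number of subjects, the 30-day month, the 5 sampled days, the slope of 0.5 and all variances are hypothetical. Personal exposure is averaged either over the full month or over a subset of days, and both are regressed on the full-month surrogate mean.

```python
import numpy as np

rng = np.random.default_rng(1)


def simulate_slopes(n_subjects=300, n_days=30, k_sampled=5, n_reps=200, slope=0.5):
    """Return slope estimates using full-month vs. subsampled personal data."""
    full, sub = [], []
    for _ in range(n_reps):
        # Subject-specific monthly level plus day-to-day variation (ug/m3).
        month_level = rng.normal(15.0, 4.0, (n_subjects, 1))
        ambient = month_level + rng.normal(0.0, 5.0, (n_subjects, n_days))
        personal = 2.0 + slope * ambient + rng.normal(0.0, 2.0, (n_subjects, n_days))

        x = ambient.mean(axis=1)          # surrogate: full-month mean
        y_full = personal.mean(axis=1)    # personal mean over all days
        # Personal mean over the first k days only; days are exchangeable
        # in this simulation, so this is equivalent to random sampling.
        y_sub = personal[:, :k_sampled].mean(axis=1)

        full.append(np.polyfit(x, y_full, 1)[0])
        sub.append(np.polyfit(x, y_sub, 1)[0])
    return np.array(full), np.array(sub)


full_slopes, sub_slopes = simulate_slopes()
print(f"full month: mean slope {full_slopes.mean():.3f}, SD {full_slopes.std():.3f}")
print(f"subsampled: mean slope {sub_slopes.mean():.3f}, SD {sub_slopes.std():.3f}")
```

Under these assumptions both sets of slopes should centre on the simulated value of 0.5, with a visibly larger spread for the subsampled personal data, which mirrors the wider confidence intervals described above.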

Finally, our goal was to assess exposure measurement 
error in long-term PM2.5 exposures. As described earlier, 
personal exposure studies are infeasible for long periods 
and, given current data availability, we were only able 
to conduct our analyses using monthly averages. Many 
long-term PM2.5 studies use exposure metrics based on
functions of monthly averages (e.g. 12-month moving
average [11] or cumulatively updated monthly average
[60]), and we therefore believe that our findings provide 
useful information in the interpretation of chronic health 
effect estimates. 

We compiled data from 9 cities across the United States 
for our analyses and calculated calibration coefficients 
that may be informative for interpreting risk estimates 
in nationwide studies of long-term PM2.5 health effects. 
For instance, differential measurement error could be par- 
tially responsible for the higher effects reported by Puett 
et al. (2009) [11], who used PM2.5 predictions outside the 
participants' homes, as compared to the effects found by
Krewski et al. (2005) [61], who used metropolitan area 
means of PM2.5 concentrations at ambient monitors. 

To our knowledge, this is the first study to assess error 
due to two different, widely used, surrogate exposures, 
using personal exposure data from multiple US cities. Fur- 
ther, we identified variables explaining the heterogeneity 
in the calibration coefficients across cities, with the vari- 
ances of the reported calibration coefficients potentially 
reflecting this heterogeneity. 

At this time, we do not recommend using the calibration 
coefficients reported here to directly adjust health effect 
estimates in epidemiology studies. Given the observed 
between-city heterogeneity, the complex, time-varying 
nature of the exposures and the lack of information on 
individual characteristics, which would be included as 
confounders in health models, standard error correction 
methods such as ordinary regression calibration could 
still yield biased estimates [62,63]. Our group is currently 
developing methods to account for the above limita- 
tions in order to correctly adjust health effect estimates 
obtained using surrogate exposures. Furthermore, future 
research on PM2.5-related measurement error should
characterize measurement error for regional and local 
PM2.5 by focusing on PM2.5 composition, which changes 
both over space and time, suggesting that calibration coef- 
ficients will also change over space and time [6,8,48]. 
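
To make concrete why a single calibration coefficient is a fragile basis for correction, the sketch below applies the simplest form of regression calibration, dividing a naive effect estimate by the calibration coefficient, across the point estimate and 95% confidence limits reported in the abstract for the nearest monitor (0.31; 0.14, 0.47). The naive hazard ratio of 1.10 is hypothetical, the function name is ours, and the calculation ignores confounders and the uncertainty in the naive estimate itself. It is an illustration of sensitivity, not a recommended adjustment.

```python
import math


def regression_calibration(beta_naive, calibration_coef):
    """Simplest regression calibration: divide the naive log-scale
    effect estimate by the calibration coefficient (must be > 0)."""
    if calibration_coef <= 0:
        raise ValueError("calibration coefficient must be positive")
    return beta_naive / calibration_coef


# Hypothetical naive hazard ratio per unit increase in surrogate exposure.
beta_naive = math.log(1.10)

# Nearest-monitor calibration coefficient and its 95% CI from the abstract.
for gamma in (0.14, 0.31, 0.47):
    corrected_hr = math.exp(regression_calibration(beta_naive, gamma))
    print(f"calibration coefficient {gamma:.2f} -> corrected HR {corrected_hr:.2f}")
```

Because the coefficient sits in the denominator, small coefficients inflate the corrected estimate sharply, so the between-city range of coefficients translates into a wide range of corrected effects.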

Conclusions 

With our study we were able to assess the ability of 
two widely used surrogate exposures to reflect personal 
exposures: ambient concentrations measured at centrally
located monitors, as well as outdoor home predictions.
Our estimated calibration coefficients are consistent with 
previously reported chronic health risks using nearest 
monitor exposures being under-estimated when ambient 
concentrations were the exposure of interest. For outdoor 
home predictions, our results suggest less error. 

Additional file 